Rank | Count | Beginning |
---|---|---|
1966 | 744 | ਇਸ |
2882 | 505 | ਇਹ |
642 | 335 | ਉਹ |
50 | 326 | ਉਸ |
6864 | 213 | ਦੇ |
9704 | 175 | ਵਿੱਚ |
365 | 163 | ਉਸਨੇ |
3468 | 107 | ਇੱਕ |
1414 | 101 | ਅਤੇ |
6679 | 99 | ਦੀ |
6577 | 72 | ਦਾ |
5792 | 65 | ਜਦੋਂ |
891 | 61 | ਉਹਨਾਂ |
1123 | 61 | ਉਨ੍ਹਾਂ |
3230 | 56 | ਇਹਨਾਂ |
3679 | 56 | ਇਨ੍ਹਾਂ |
7401 | 56 | ਨੇ |
7580 | 56 | ਪਰ |
7918 | 53 | ਪੰਜਾਬੀ |
7739 | 52 | ਪਿੰਡ |
6455 | 50 | ਤੋਂ |
7353 | 47 | ਨੂੰ |
8297 | 43 | ਬਾਅਦ |
8104 | 40 | ਫਿਰ |
8579 | 40 | ਭਾਰਤ |
2418 | 37 | ਇਸਦੇ |
7261 | 35 | ਨਾਲ |
1784 | 34 | ਆਪਣੇ |
3617 | 34 | ਇੱਥੇ |
4692 | 32 | ਹਾਲਾਂਕਿ |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
Especially for N=1, we only need a small corpus to identify the most frequent sentence beginnings.
SELECT SUBSTRING_INDEX(sentence, ' ', 1) AS beg, COUNT(*) AS cnt FROM sentences GROUP BY SUBSTRING_INDEX(sentence, ' ', 1) ORDER BY cnt DESC LIMIT 50;
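For readers without access to the database, the same count can be approximated in plain Python. The following is a minimal sketch, not the project's own tooling: the function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical, and it assumes the sentences are available as a list of strings with words separated by single spaces.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper mirroring the query above: splitting on a single
    space and keeping the first n pieces corresponds to
    substring_index(sentence, ' ', n).
    """
    counts = Counter()
    for sentence in sentences:
        # A sentence with fewer than n words is kept whole, as in the query.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by descending count, like "order by cnt desc limit 50".
    return counts.most_common(limit)


if __name__ == "__main__":
    # Illustrative sentences only, not taken from the corpus.
    sample = ["ਇਸ ਪਿੰਡ ਵਿੱਚ", "ਇਸ ਦਾ", "ਇਹ ਇੱਕ"]
    for beginning, count in sentence_beginnings(sample, n=1):
        print(count, beginning)
```

Passing `n=2`, `n=3`, or `n=4` would give the longer beginnings covered in the following subsections, under the same single-space assumption.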